R packages

Required packages: ggthemes, fixest, broom, modelsummary, tidyverse

# load required packages
if (!require("pacman")) install.packages("pacman")
## Loading required package: pacman
pacman::p_load(
  ggthemes, fixest, broom, modelsummary, tidyverse
)

Hands on: Intro to difference-in-differences

Generating fake data

To see how DD works in practice we will first generate a fake dataset where we know the true value of what we want to estimate. This is a good skill to have in order to make sure you understand how methods work, and that your code does what you want it to do. If you know what the result is and your code produces something different, your code is wrong.

To generate fake data we need to model the true data generating process. We will make this exactly the DD model, but with some added noise.

Random numbers

Any time you introduce randomness into your code, you will want to set a random number generator seed so it is reproducible.

# Initialize RNG
set.seed(123456)

Every time you run this notebook you will get the exact same results even though we are drawing “random” numbers.

The model parameters

The model generating our data will be very simple: 1. A time-invariant factor specific to both NY and PA 2. A period-specific factor common across both NY and PA 3. A treatment effect in NY after the fake policy is implemented 4. Some added random noise in outcomes

First we need to define the time-invariant factors.

# Fixed state characteristics: ny has more biodiversity
state_fes <- tribble(~state, ~state_fe,
                     "NY", 88,
                     "PA", 44)

Here we used a tribble, which is just a tibble but where we can write it out element-by-element. This is helpful when you need to initialize small data frames.

Next we need the common, period-specific factors.

# Period characteristics: biodiversity is declining
period_fes <- tribble(~period, ~period_fe,
                     "before", 341,
                     "after", 207)

Last, we need to define the size of the treatment effect.

# True effect of the policy: how many additional species are saved?
treatment_effect <- 55

Now we have defined all the parameters required to generate our data. Next, we need to actually generate the data. This takes a few steps.

Generating the data

First, we need to generate N observations for each combination of before/after and NY/PA. We can do this with crossing.

# All combos of states, periods, and observation numbers
bio_df <- crossing(
  state = c("NY", "PA"),
  period = c("before", "after"),
  obs = 1:500
)
bio_df
## # A tibble: 2,000 x 3
##    state period   obs
##    <chr> <chr>  <int>
##  1 NY    after      1
##  2 NY    after      2
##  3 NY    after      3
##  4 NY    after      4
##  5 NY    after      5
##  6 NY    after      6
##  7 NY    after      7
##  8 NY    after      8
##  9 NY    after      9
## 10 NY    after     10
## # … with 1,990 more rows

Second, we need to bring in the time-invariant, and period-specific factors with join functions.

bio_df <- bio_df %>%
  inner_join(period_fes) %>% 
  inner_join(state_fes)
## Joining, by = "period"
## Joining, by = "state"
bio_df
## # A tibble: 2,000 x 5
##    state period   obs period_fe state_fe
##    <chr> <chr>  <int>     <dbl>    <dbl>
##  1 NY    after      1       207       88
##  2 NY    after      2       207       88
##  3 NY    after      3       207       88
##  4 NY    after      4       207       88
##  5 NY    after      5       207       88
##  6 NY    after      6       207       88
##  7 NY    after      7       207       88
##  8 NY    after      8       207       88
##  9 NY    after      9       207       88
## 10 NY    after     10       207       88
## # … with 1,990 more rows

Third, it will be helpful to define a variable for units that are treated: New York and after.

bio_df <- bio_df %>% 
  mutate(treated = state == "NY" & period == "after",
         period = factor(period, levels = c("before", "after")),
         period = relevel(period, "before"))
bio_df
## # A tibble: 2,000 x 6
##    state period   obs period_fe state_fe treated
##    <chr> <fct>  <int>     <dbl>    <dbl> <lgl>  
##  1 NY    after      1       207       88 TRUE   
##  2 NY    after      2       207       88 TRUE   
##  3 NY    after      3       207       88 TRUE   
##  4 NY    after      4       207       88 TRUE   
##  5 NY    after      5       207       88 TRUE   
##  6 NY    after      6       207       88 TRUE   
##  7 NY    after      7       207       88 TRUE   
##  8 NY    after      8       207       88 TRUE   
##  9 NY    after      9       207       88 TRUE   
## 10 NY    after     10       207       88 TRUE   
## # … with 1,990 more rows

Now we need to generate our data by adding the time-invariant effect state_fe, the period-specific effect period_fe, and some random noise given by rnorm. If the observation is New York after the policy we will also add treatment_effect.

bio_df <- bio_df %>% 
  mutate(
    biodiversity = ifelse(
      treated,
      period_fe + state_fe + treatment_effect + 100*rnorm(n()),
      period_fe + state_fe  + 100*rnorm(n())
    )
  )
bio_df
## # A tibble: 2,000 x 7
##    state period   obs period_fe state_fe treated biodiversity
##    <chr> <fct>  <int>     <dbl>    <dbl> <lgl>          <dbl>
##  1 NY    after      1       207       88 TRUE            433.
##  2 NY    after      2       207       88 TRUE            322.
##  3 NY    after      3       207       88 TRUE            314.
##  4 NY    after      4       207       88 TRUE            359.
##  5 NY    after      5       207       88 TRUE            575.
##  6 NY    after      6       207       88 TRUE            433.
##  7 NY    after      7       207       88 TRUE            481.
##  8 NY    after      8       207       88 TRUE            600.
##  9 NY    after      9       207       88 TRUE            467.
## 10 NY    after     10       207       88 TRUE            307.
## # … with 1,990 more rows

Raw data

Let’s plot the data. New York is in black, Pennsylvania is in orange. We’re using geom_jitter horizontally (height = 0) so we can better see the variation in the data.

# Switch the order of the period variables so before comes first
bio_df$period = factor(bio_df$period, levels = c("before", "after"))

ggplot(bio_df, group = interaction(state, period)) +
  geom_jitter(aes(x = period, y = biodiversity, color = interaction(state), shape = interaction(state), height = 0), size = 3) +
  annotate("text", size = 8, label = "NY", x = 1, y = 700) +
  annotate("text", size = 8, label = "PA", x = 1, y = 100, color = "orange") +
  scale_color_colorblind() +
  theme_minimal() +
  scale_x_discrete(labels = c("Before Treatment", "After Treatment")) + 
  scale_y_continuous(limits = c(50, 700)) +
  labs(
    x = "Time",
    y = "Biodiversity",
    title = "The raw fake data"
  ) +
  theme(
    legend.position = "none",
    title = element_text(size = 24),
    axis.text.x = element_text(size = 24), axis.text.y = element_text(size = 24),
    axis.title.x = element_text(size = 24), axis.title.y = element_text(size = 24),
    panel.grid.minor.x = element_blank(), panel.grid.minor.y = element_blank(),
    panel.grid.major.x = element_blank(),
    axis.line = element_line(colour = "black")
  ) 
## Warning: Ignoring unknown aesthetics: height
## Warning: Removed 13 rows containing missing values (geom_point).

Notice that you can clearly see the decreasing time trend (before is higher than after), and also that NY tends to always have higher biodiversity than PA (black higher than orange). Seeing the effect in just the pure raw data is tricky here without taking means. Later in the slides we will walk through DD step-by-step and see how it works. For now we will just directly go to estimating the effect using DD and the fixest::feols package.

Difference-in-differences

# Outcome ~ treatment + period fixed effect + state fixed effect, data
estimates <- fixest::feols(
  biodiversity ~ treated + period + state, 
  data = bio_df
) 

msummary(estimates, # report estimates in a table
    coef_map = c(
      "statePA" = "Pennsylvania",
      "periodafter" = "After",
      "treatedTRUE" = "New York & After (Treatment Effect)"
    ),
    gof_omit = "R2|AIC|BIC|Log.|FE|Std.|Obs",
    stars = TRUE,
    fmt = "%.2f",
    title = "The effect of the NY policy on biodiversity",
  )
The effect of the NY policy on biodiversity
Model 1
Pennsylvania -47.98***
(6.27)
After -139.26***
(6.27)
New York & After (Treatment Effect) 54.73***
(8.87)
* p < 0.1, ** p < 0.05, *** p < 0.01

Now compare the true values (left) versus the estimates (right). DD correctly recovers the true treatment effect (with some noise)!

coefs <- estimates$coefficients
c(treatment_effect, coefs[2]) # treatment_effect
##             treatedTRUE 
##    55.00000    54.72843
c(period_fes$period_fe[period_fes$period == "after"] - period_fes$period_fe[period_fes$period == "before"], coefs[3]) # period_after
##             periodafter 
##   -134.0000   -139.2645
c(state_fes$state_fe[state_fes$state == "PA"] - state_fes$state_fe[state_fes$state == "NY"], coefs[4]) # statePA
##             statePA 
## -44.00000 -47.98205

DD addresses time-invariant and common period-specific factors

Now let’s look at what happens if take more naive approaches that do not address time-invariant factors, or common-period specific factors. We will look at how the failure to address these things biases our estimates of the effect of the policy in predictable ways.

Suppose we just regressed biodiversity on the existence of the conservation policy.

# Outcome ~ treatment, data
fixest::feols(
  biodiversity ~ treated, 
  data = bio_df
  ) %>%  # use the BLL data frame
  msummary( # report estimates in a table
  coef_map = c(
    "periodafter" = "After",
    "treatedTRUE" = "New York & After (Treatment Effect)"
  ),
  gof_omit = "R2|AIC|BIC|Log.|FE|Std.|Obs",
  stars = TRUE,
  fmt = "%.2f",
  title = "The effect of the NY policy on biodiversity",
  )
The effect of the NY policy on biodiversity
Model 1
New York & After (Treatment Effect) -6.13
(6.23)
* p < 0.1, ** p < 0.05, *** p < 0.01

We underestimate the true effect (and in fact have an estimate with the wrong sign!) because we did not difference out the common decline in biodiversity over time. The estimated effect of the policy in NY is confounded by the common decline in biodiversity across both states. If we took this estimate at face value it says that conservation policies backfire and make biodiversity worse.

What if now we controlled for these common, period-specific factors?

# Outcome ~ treatment + period fixed effects, data
fixest::feols(
  biodiversity ~ treated + period, 
  data = bio_df
  ) %>%  # use the BLL data frame
  msummary( # report estimates in a table
  coef_map = c(
    "periodafter" = "After",
    "treatedTRUE" = "New York & After (Treatment Effect)"
  ),
  gof_omit = "R2|AIC|BIC|Log.|FE|Std.|Obs",
  stars = TRUE,
  fmt = "%.2f",
  title = "The effect of the NY policy on biodiversity",
  )
The effect of the NY policy on biodiversity
Model 1
After -163.26***
(5.51)
New York & After (Treatment Effect) 102.71***
(6.36)
* p < 0.1, ** p < 0.05, *** p < 0.01

Now we overestimate the true effect. We (still) did not difference out the fact that NY tends to have higher levels of biodiversity. The estimated treatment effect is biased upward because NY is just more biodiverse even without the policy. We are overstating the impact of the policy because it happened to be adopted by a state with a lot of biodiversity.

Hands on: Hollingsworth and Rudik (2021)

Now lets work through the HR data to see how they got their results. We will just start with the cleaned data instead of raw. Cleaning the data is a huge chunk of the actual work of doing research (90%+), but takes a lot of code/time and is outside the scope of this class.

Lead poisoning

First, load the lead poisoning data and look at it.

## Read in blood lead data
bll_df <- read.csv("data/hr_2021_bll.csv") %>% as_tibble()
bll_df
## # A tibble: 22,887 x 13
##    state county percent_high_co… ihs_bll  race  year weight unemp_rate
##    <int>  <int>            <dbl>   <dbl> <int> <int>  <dbl>      <dbl>
##  1     1      1            0.752   0.695     0  2015  11.5      0.052 
##  2     1      1            0       0         0  2014   5.39     0.059 
##  3     1      1            0       0         0  2013   4.47     0.062 
##  4     1      1            0       0         0  2012   3.16     0.069 
##  5     1      1            0       0         0  2011   6        0.083 
##  6     1      1            0       0         0  2010  13.4      0.089 
##  7     1      1            1.14    0.979     0  2009  13.2      0.0970
##  8     1      1            0.543   0.520     0  2008  13.6      0.051 
##  9     1      1            0.610   0.577     0  2007  12.8      0.033 
## 10     1      1            4.44    2.20      0  2006   6.71     0.033 
## # … with 22,877 more rows, and 5 more variables: median_income <int>,
## #   percent_non_white <dbl>, lead_emitted <dbl>, ap <int>, state_county <int>

What are the key variables?

  • state: state fips code
  • county: county fips code
  • state_county: full fips code
  • year: the year
  • ihs_bll: inverse hyperbolic sine of the lead poisoning rate
  • race: categorical variable equal to 2 if a treatment county, 1 if a border county, 0 if a control county

The raw data

First, let’s plot the raw data. This is something you should generally always do. It lets you see what is actually happening, whether you made data cleaning errors, etc. If what you’re trying to estimate clearly pops outs in the raw data that’s a good sign.

Here we will be plotting the raw data with two changes: 1. Only plot for 2005-2009. After 2009 some counties stop reporting lead poisoning, so the composition of the sample changes, making the comparison tricky. 2. Plot the weighted mean lead poisoning rate where we weight by the number of children tested. First, we want to plot a mean so we can more clearly see any trends and differences across treated and control counties. Second, we want a weighted mean because counties with more testing report estimated rates that less noisy measures of the true rate. We want these counties to count more toward our nationwide average.

bll_df %>% 
  group_by(race, year) %>% 
  summarise(percent_high_conditional_tested = weighted.mean(percent_high_conditional_tested, weight, na.rm = T)) %>% 
  filter(year >= 2005 & year <= 2009) %>% 
  ggplot(aes(x = year, y = percent_high_conditional_tested)) +
    geom_vline(xintercept = 2006.5, color = "grey55", linetype = "dashed") +
    geom_point(aes(shape = as.factor(race), color = as.factor(race)), size = 5) + 
    # everything below this line is to just make the plot look cleaner/fancier
    annotate("text", size = 4, label = "Treated", x = 2005.25, y = 2.85) +
    annotate("text", size = 4, label = "Border", x = 2005.25, y = 2.5) +
    annotate("text", size = 4, label = "Control", x = 2005.25, y = 1.4) +
    annotate("text", size = 6, label = "Leaded Fuel", x = 2005.50, y = 0.25) +
    annotate("text", size = 6, label = "Unleaded Fuel", x = 2008, y = 0.25) +
    theme_minimal() +
    scale_color_colorblind() +
    labs(
      x = "Year",
      y = "Percent Lead Poisoned",
      title = "Average lead poisoning rates by type (balanced panel)"
    ) +
    theme(
      legend.position = "none",
      title = element_text(size = 14),
      axis.text.x = element_text(size = 14), axis.text.y = element_text(size = 14),
      axis.title.x = element_text(size = 14), axis.title.y = element_text(size = 14),
      panel.grid.minor.x = element_blank(), panel.grid.minor.y = element_blank(),
      panel.grid.major.x = element_blank(),
      axis.line = element_line(colour = "black")
    ) 
## `summarise()` regrouping output by 'race' (override with `.groups` argument)

Before the policy, we have parallel-ish pre-trends across the three groups. This is a good sign for the validity of the DD approach. In the post period, we see that treated and border county lead poisoning rates drop relative to control counties, and then stay on the same trend thereafter. Treated counties drop more than border counties. All of this indicates that the removal of lead in 2007 had an effect on lead poisoning rates, and that being close to a lead source is worse (good sanity check).

Difference-in-differences

Now let’s actually estimate the effect it had on lead poisoning rates. HR uses a DD design in their appendix. Let’s estimate the effect corresponding to their Table A8 Column 1. We will regress the inverse hyperbolic sine (approximation to log when you have a lot to zeros) on: - County fixed effects (county time-invariant characteristics) - Year fixed effects (common period-specific characteristics) - County type by period (as.factor(race)*I(year < 2007)), this is the treatment variable

## Difference-in-differences estimate
fixest::feols(
  ihs_bll ~ as.factor(race)*I(year < 2007) | state_county + year, 
  weights = ~weight, # Weight by number of children tested
  data = bll_df) %>%  # use the BLL data frame
  msummary( # report estimates in a table
  cluster = c("state_county"),
  coef_map = c(
    "as.factor(race)2:I(year < 2007)TRUE" = "Treated",
    "as.factor(race)1:I(year < 2007)TRUE" = "Border"
  ),
  gof_omit = "R2|AIC|BIC|Log.|FE|Std.|Obs",
  stars = TRUE,
  fmt = "%.2f",
  title = "The effect of being in a treated or border county on lead poisoning rates",
  notes = list("Robust standard errors are clustered at the county level.")
)
## The variable 'I(year < 2007)TRUE' has been removed because of collinearity (see $collin.var).
The effect of being in a treated or border county on lead poisoning rates
Model 1
Treated 0.22**
(0.10)
Border 0.09*
(0.05)
Robust standard errors are clustered at the county level.
* p < 0.1, ** p < 0.05, *** p < 0.01

What do our estimates tell us? That counties with leaded racing have higher rates of lead poisoning than border counties (0.22 > 0.09), when have higher rates than control counties (0.09 > 0). There is enough of a treatment of lead from racing for it to show up in people’s blood tests.

Event study

With raw data, you need to hone down to a balanced panel. If you actually estimate the event study, you can use the full unbalanced panel dataset. Now let’s estimate the event study. The event study regression is: \[\text{ihs(lead poisoned})_{cy} = \sum_{y \neq 2007}\beta^b_y 1(\text{race in border county})_{cy} + \sum_{y \neq 2007}\beta^r_y 1(\text{race in county})_{cy} + \text{controls} + \text{year fixed effects} + \text{county fixed effects}\]

In an event study, we estimate effects of being in the treatment group in every year relative to some omitted year (here 2007). Event studies make for nice plots so we are going to skip the table and go straight to plotting the effects.

########################################
## Event study

# Need to factor year so we can omit 2007
bll_df$year <- factor(bll_df$year, ordered = FALSE)
bll_df$year <- relevel(bll_df$year, 3)

# Estimate event study
event_study <- fixest::feols(
  ihs_bll ~ as.factor(race)*year + 
    as.factor(state)*year + 
    unemp_rate + median_income + percent_non_white + lead_emitted + ap | state_county + year, 
  weights = ~weight,
  data = bll_df) %>% 
  tidy(cluster = "state") %>% 
  filter(str_detect(term, "^as.factor\\(race\\)2:year")) %>% 
  add_row(estimate = 0, std.error = 0, .after = 2) %>% 
  mutate(year = row_number() + 2004)
## Variables 'year2005', 'year2006' and 98 others have been removed because of collinearity (see $collin.var).
# Plot event study
ggplot(data = event_study, aes(x = year, y = estimate)) +
  geom_hline(yintercept = 0, size = 0.5) +
  geom_vline(xintercept = 2006.5, color = "grey55", linetype = "dashed") +
  geom_ribbon(aes(ymin = estimate - 1.96*std.error, ymax = estimate + 1.96*std.error), 
              fill = "darkslateblue", alpha = 0.4) +
  geom_line(size = 1) +
  geom_point(size = 5) +
  annotate("text", size = 4, label = "Leaded Fuel", x = 2005.50, y = -0.26) +
  annotate("text", size = 4, label = "Unleaded Fuel", x = 2012, y = -0.26) +
  theme_minimal() +
  scale_color_colorblind() +
  scale_x_continuous(breaks = seq(2005, 2015, 2)) +
  labs(
    x = "Year",
    y = "Percent Lead Poisoned",
    title = "Percent lead poisoned relative to 2007 (first unleaded year)"
  ) +
  theme(
    legend.position = "none",
    title = element_text(size = 14),
    axis.text.x = element_text(size = 14), axis.text.y = element_text(size = 14),
    axis.title.x = element_text(size = 14), axis.title.y = element_text(size = 14),
    panel.grid.minor.x = element_blank(), panel.grid.minor.y = element_blank(),
    panel.grid.major.x = element_blank(),
    axis.line = element_line(colour = "black")
  )

What is the event study telling us? That lead poisoning rates were higher in 2005 and 2006 relative to 2007 in race counties. Also that the trend in race counties was relatively flat compared to control counties (2005 and 2006 estimates are about the same, also there were a few unleaded races in 2006 so a slight dip is expected). This is our parallel pre-trends assumption. Once lead was removed in 2007, the effect of being in a treatment county drops. The trend after 2007 is flat: control and treatment counties are on parallel trends in the post-period.

One drawback with the lead poisoning data is we only have 2 pre-periods. This makes it hard to see whether we do have parallel pre-trends.

Elderly mortality

Now let’s switch to looking at all-cause elderly mortality. First, load the data.

########################################
## Read in mortality data
mortality_df <- read.csv("data/hr_2021_mortality.csv") %>% as_tibble()
mortality_df
## # A tibble: 58,107 x 13
##    state county age_adjusted_ra…  race  year unemp_rate median_income
##    <int>  <int>            <dbl> <int> <int>      <dbl>         <dbl>
##  1     1      1            4856.     0  2017    NA                 NA
##  2     1      1            4967.     0  2016     0.051          54487
##  3     1      1            5118.     0  2015     0.052          56580
##  4     1      1            4651.     0  2014     0.059          54366
##  5     1      1            5460      0  2013     0.062          51868
##  6     1      1            5863.     0  2012     0.069          51441
##  7     1      1            4920.     0  2011     0.083          48863
##  8     1      1            5154.     0  2010     0.089          53049
##  9     1      1            4932.     0  2009     0.0970         53081
## 10     1      1            5705.     0  2008     0.051          51622
## # … with 58,097 more rows, and 6 more variables: percent_non_white <dbl>,
## #   lead_emitted <dbl>, ap <int>, pop_total <int>, weight <dbl>,
## #   state_county <int>

What are the key variables?

  • state: state fips code
  • county: county fips code
  • state_county: full fips code
  • year: the year
  • age_adjusted_rate: age-adjusted mortality rate
  • race: categorical variable equal to 2 if a treatment county, 1 if a border county, 0 if a control county

The paper plots the raw data but it’s kind of hard to see what’s going on because mortality rates are declining so quickly. We are going to jump right into the DD and event studies. Feel free to try to reproduce their Figure 6 raw data plot using the same approach above.

## Difference-in-differences estimate
fixest::feols(
  age_adjusted_rate ~ as.factor(race)*I(year < 2007) | state_county + year, 
  weights = ~weight, # Weight by number of children tested
  data = mortality_df) %>%  # use the BLL data frame
  msummary( # report estimates in a table
  cluster = c("state_county"),
  coef_map = c(
    "as.factor(race)2:I(year < 2007)TRUE" = "Treated",
    "as.factor(race)1:I(year < 2007)TRUE" = "Border"
  ),
  gof_omit = "R2|AIC|BIC|Log.|FE|Std.|Obs",
  stars = TRUE,
  fmt = "%.2f",
  title = "The effect of being in a treated or border county on elderly mortality",
  notes = list("Robust standard errors are clustered at the county level.")
)
## NOTES: 1,870 observations removed because of 0-weight.
##        2,212 observations removed because of NA values (Breakup: LHS: 2,212, RHS: 0, Fixed-effects: 0, Weights: 0).
## The variable 'I(year < 2007)TRUE' has been removed because of collinearity (see $collin.var).
The effect of being in a treated or border county on elderly mortality
Model 1
Treated 121.35***
(31.90)
Border 69.50***
(16.47)
Robust standard errors are clustered at the county level.
* p < 0.1, ** p < 0.05, *** p < 0.01

Being exposed to lead in-county raises mortality rates by 121 per 100,000 elderly, and being exposed to lead in a bordering county only raises it by 70 per 100,000. Now let’s plot the event study similarly to how we did lead poisoning.

# Need to factor year so we can omit 2007
mortality_df$year <- factor(mortality_df$year, ordered = FALSE)
mortality_df$year <- relevel(mortality_df$year, 9)

# Estimate event study
event_study <- fixest::feols(
  age_adjusted_rate ~ as.factor(race)*year + 
    as.factor(state)*year + 
    unemp_rate + median_income + percent_non_white + lead_emitted + ap | state_county + year, 
  weights = ~weight,
  data = mortality_df) %>% 
  broom::tidy(cluster = "state") %>% 
  filter(str_detect(term, "^as.factor\\(race\\)2:year")) %>% 
  add_row(estimate = 0, std.error = 0, .after = 8) %>% 
  mutate(year = row_number() + 1998)
## NOTES: 1,870 observations removed because of 0-weight.
##        4,047 observations removed because of NA values (Breakup: LHS: 2,212, RHS: 1,885, Fixed-effects: 0, Weights: 0).
## Variables 'year1999', 'year2000' and 65 others have been removed because of collinearity (see $collin.var).
# Plot event study
ggplot(data = event_study, aes(x = year, y = estimate)) +
  geom_hline(yintercept = 0, size = 0.5) +
  geom_vline(xintercept = 2006.5, color = "grey55", linetype = "dashed") +
  geom_ribbon(aes(ymin = estimate - 1.96*std.error, ymax = estimate + 1.96*std.error), 
              fill = "darkslateblue", alpha = 0.4) +
  geom_line(size = 1) +
  geom_point(size = 5) +
  annotate("text", size = 4, label = "Leaded Fuel", x = 2002, y = -200) +
  annotate("text", size = 4, label = "Unleaded Fuel", x = 2012, y = -200) +
  theme_minimal() +
  scale_color_colorblind() +
  scale_x_continuous(breaks = seq(1999, 2015, 2)) +
  labs(
    x = "Year",
    y = "Age-Adjusted Mortality Rate (deaths per 100,000)",
    title = "Mortality rate relative to 2007 (first unleaded year)"
  ) +
  theme(
    legend.position = "none",
    title = element_text(size = 14),
    axis.text.x = element_text(size = 14), axis.text.y = element_text(size = 14),
    axis.title.x = element_text(size = 14), axis.title.y = element_text(size = 14),
    panel.grid.minor.x = element_blank(), panel.grid.minor.y = element_blank(),
    panel.grid.major.x = element_blank(),
    axis.line = element_line(colour = "black")
  )

The event study plots the effect of being in a treated county relative to a control county by year. One of the nice benefits of the mortality data is that there is a longer pre-period: we can clearly see that there are parallel pre-trends. After 2007, there is a drop in the trend: mortality rates declined in treatment counties relative to control counties.